Performance Analysis of Clustering in Privacy Preserving Data Mining

Author

  • Bipul Roy
Abstract

Privacy is becoming an increasingly important issue in many data mining applications, which has triggered the development of many privacy preserving data mining techniques. A frequently used disclosure protection method is data perturbation. When used for data mining, perturbation should preserve the statistical relationships between attributes while providing adequate protection for individual confidential data. Existing perturbation methods typically require that the statistical properties of the data be specified with known distributions. We propose a tree-based perturbation method that can be easily used for perturbing data without knowing the underlying distributions. Our method employs a kd-tree technique to recursively partition a dataset into smaller subsets such that the data records within each subset become more homogeneous after each partition. Once the partitioning process is completed, the confidential data in each subset are perturbed using microaggregation. An experimental study shows that our proposed method outperforms additive and multiplicative noise perturbation methods for clustering applications.
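The two steps described in the abstract, kd-tree partitioning followed by microaggregation, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state how the split attribute is chosen, when partitioning stops, or which aggregate is used, so the sketch assumes median splits that cycle through the attributes, a hypothetical `min_size` stopping threshold, and replacement of each confidential value by its subset mean. The function names and the sample records are likewise illustrative.

```python
from statistics import mean


def kd_partition(rows, attrs, min_size, depth=0):
    """Recursively split rows (a list of dicts) at the median of one attribute.

    Assumption: the split attribute cycles through attrs by depth, and a
    subset is split only if both halves keep at least min_size records.
    Returns the list of leaf subsets.
    """
    if len(rows) < 2 * min_size:
        return [rows]
    attr = attrs[depth % len(attrs)]
    ordered = sorted(rows, key=lambda r: r[attr])
    mid = len(ordered) // 2
    return (kd_partition(ordered[:mid], attrs, min_size, depth + 1)
            + kd_partition(ordered[mid:], attrs, min_size, depth + 1))


def microaggregate(leaves, confidential):
    """Replace each confidential value with the mean of its leaf subset."""
    perturbed = []
    for leaf in leaves:
        means = {a: mean(r[a] for r in leaf) for a in confidential}
        for r in leaf:
            perturbed.append({**r, **means})
    return perturbed


# Illustrative records (made up for this sketch); income is confidential.
rows = [
    {"age": 23, "income": 31000}, {"age": 25, "income": 35000},
    {"age": 31, "income": 42000}, {"age": 35, "income": 50000},
    {"age": 41, "income": 58000}, {"age": 47, "income": 61000},
    {"age": 52, "income": 75000}, {"age": 60, "income": 80000},
]
leaves = kd_partition(rows, ["age", "income"], min_size=2)
perturbed = microaggregate(leaves, ["income"])
for record in perturbed:
    print(record)
```

Because every confidential value is replaced by a subset mean, the overall mean of the confidential attribute is unchanged, and no released value corresponds to fewer than `min_size` original records. Smaller subsets preserve more of the data's structure for clustering, while larger ones give stronger protection.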


Similar resources

A High Performance Privacy Preserving Clustering Approach in Distributed Networks

Privacy preservation in data mining over distributed networks remains an important research issue in knowledge and data engineering. In community-based clustering approaches, privacy is an important factor when datasets from different data holders or players are integrated for mining. Secure mining of data is required in an open network. In this paper we propose an efficient ...


Classification via Clustering for Anonymized Data

The exponential growth of hardware technology, particularly in electronic data storage media and the processing of such data, has raised serious ethical, philosophical, and legal issues related to protecting individual privacy. Data mining techniques are employed to ensure privacy. Privacy Preserving Data Mining (PPDM) techniques aim at protecting the sensitive dat...


Privacy-preserving data mining in homogeneous collaborative clustering

Privacy has become an important concern in data mining. In this paper, a novel algorithm for preserving privacy in a distributed environment using data clustering is proposed. As demonstrated, the data is locally clustered and the encrypted aggregated information is transferred to the master site. This aggregated information consists of centroids of clusters along with their...


Privacy Preserving Clustering by Data Transformation

Despite their benefits in a wide range of applications, data mining techniques have also raised a number of ethical issues. These include privacy, data security, intellectual property rights, and many others. In this paper, we address the privacy problem against unauthorized secondary use of information. To do so, we introduce a family of geometric data transformation methods (...


CLUST-SVD: Privacy preserving clustering in singular value decomposition

Large repositories of data contain sensitive information that must be protected against unauthorized access. The protection of the confidentiality of this information has been a long-term goal for the database security research community and for the government statistical agencies. Recent advances in data mining and machine learning algorithms have increased the disclosure risks that one may en...


A Model Based Framework for Privacy Preserving Clustering Using SOM

Privacy has become an important issue in the progress of data mining techniques. Many laws are being enacted in various countries to protect the privacy of data. This privacy concern has been addressed by developing data mining techniques under a framework called privacy preserving data mining. Presently there are two main approaches popularly used: data perturbation and secure multiparty compu...




Publication date: 2014